PATENT ABSTRACTS OF JAPAN 



(11)Publication number : 11-355747
(43)Date of publication of application : 24.12.1999

(51)Int.Cl. H04N 7/15

(21)Application number : 10-161583
(71)Applicant : NEC CORP
(22)Date of filing : 10.06.1998
(72)Inventor : KAMURA YUKARI






(54) VIDEO/SOUND COMMUNICATION EQUIPMENT AND VIDEO CONFERENCE EQUIPMENT 
USING THE SAME EQUIPMENT 

(57)Abstract: 

PROBLEM TO BE SOLVED: To eliminate the need to transmit auxiliary picture signals during a
conference. Each picture is given an identifier on the transmitter side, transmitted to the
receiver side in advance, and registered there; when the identifier is transmitted during the
conference, the receiver side selects the corresponding picture from its picture database and
displays it.
SOLUTION: On the transmitter side, the video, sound and picture signals obtained from a camera 1,
a microphone 2 and a picture database 3 are processed by picture and sound encoders 41, 42 and a
compressor 43, then multiplexed and transmitted. On the receiver side, the received multiplexed
signal is separated by a separator 51, the original signals are restored by picture and sound
decoders 53, 54 and an extension device 55, and they are output via a switch 52 to a monitor 6,
a speaker 7 and a picture database 58. The transmission side adds an identifier to each picture
signal at display control signal input 91 and transmits the signal to the reception side in
advance, where it is registered. When the identifier is transmitted during the conference, the
corresponding picture is displayed on the monitor 6. Thus, a subordinate picture is displayed
without using paper.

* NOTICES *

Japan Patent Office is not responsible for any
damages caused by the use of this translation.

1. This document has been translated by computer, so the translation may not reflect the original
precisely.
2. **** shows a word which cannot be translated.
3. In the drawings, no words are translated.



CLAIMS 



[Claim(s)] 

[Claim 1] An image and speech-communication equipment comprising an image and voice sending set
which transmits an image and voice, and an image and voice receiving set which has a receiving
means to receive the image and the voice and a reproduction means to reproduce the received image
and voice, characterized in that the image and voice sending set further has a picture
transmitting means to transmit pictures, and a display-control signal transmitting means to
transmit a display-control signal which specifies a particular picture among the pictures and
directs the display of this picture, and in that the image and voice receiving set further has
an image storage means to receive and accumulate the pictures, and an image display means which
receives the display-control signal, selects the specified picture from the image storage means,
and displays it in place of the image.

[Claim 2] The image and speech-communication equipment according to claim 1, characterized in
that the picture is displayed on a part of the image.

[Claim 3] An image and speech-communication equipment comprising an image and voice sending set
which transmits an image and voice, and an image and voice receiving set which has a receiving
means to receive the image and the voice and a reproduction means to reproduce the received image
and voice, characterized in that the image and voice sending set further has a picture
transmitting means to transmit a picture corresponding to a voice keyword, in that the image and
voice receiving set further has an image storage means to receive and accumulate the picture and
a speech recognition means to recognize the keyword from the received voice, and in that, when
the voice keyword is extracted, the picture corresponding to this voice keyword is selected from
the image storage means and displayed.

[Claim 4] The image and speech-communication equipment according to claim 3, characterized in
that the picture is displayed in place of the image.

[Claim 5] The image and speech-communication equipment according to claim 3, characterized in
that the picture is displayed on a part of the image.

[Claim 6] The image and speech-communication equipment according to any one of claims 3 to 5,
characterized in that the voice keyword includes a display start keyword which directs the start
of display of the picture and a display end keyword which directs the end of display of the
picture.

[Claim 7] An image and speech-communication equipment according to claim 6, comprising an image
and voice sending set which has a video-signal transmitting means to convert an inputted image
into a video signal and transmit it and a sound signal transmitting means to convert an inputted
voice into a sound signal and transmit it, and an image and voice receiving set which has a
video-signal receiving means to receive the video signal, a sound signal receiving means to
receive the sound signal, an image-reproduction means to reproduce the received video signal as
an image and display it on a graphic display section, and a voice reproduction means to reproduce
the received sound signal as voice, characterized in that the image and voice sending set further
has a picture transmitting means to transmit pictures to the image and voice receiving set, and a
display-control signal transmitting means to transmit a display-control signal which specifies a
particular picture among the pictures and directs the display of this picture, and in that the
image and voice receiving set further has an image database which receives and accumulates the
pictures and the voice keyword, and an image display means which receives the display-control
signal, selects the specified picture from the image storage means, and displays it in place of
the image displayed on the graphic display section.

[Claim 8] The image and speech-communication equipment according to claim 7, characterized in
that the image display means displays the picture on a part of the graphic display section.

[Claim 9] An image and voice transmission equipment comprising an image and voice sending set
which has a video-signal transmitting means to convert an inputted image into a video signal and
transmit it and a sound signal transmitting means to convert an inputted voice into a sound
signal and transmit it, and an image and voice receiving set which has a video-signal receiving
means to receive the video signal, a sound signal receiving means to receive the sound signal, an
image-reproduction means to reproduce the received video signal as an image, and a voice
reproduction means to reproduce the received sound signal as voice, characterized in that the
image and voice sending set further has a picture transmitting means to transmit a picture and a
voice keyword to the image and voice receiving set, and in that the image and voice receiving set
further has an image database which receives and accumulates the picture and the voice keyword, a
speech recognition means to recognize the voice keyword from the sound signal, and an image
display means which, when the voice keyword is extracted, selects the picture corresponding to
this voice keyword from the image database and displays it.

[Claim 10] The image and speech-communication equipment according to claim 9, characterized in
that the image-reproduction means has a graphic display section which displays the image, and the
image display means displays the picture in place of the image displayed on the graphic display
section.

[Claim 11] The image and speech-communication equipment according to claim 9, characterized in
that the image-reproduction means has a graphic display section which displays the image, and the
image display means displays the picture on a part of the graphic display section.

[Claim 12] The image and speech-communication equipment according to any one of claims 8 to 11,
characterized in that the speech recognition means further has a start directions keyword which
directs the start of display of the picture and an end directions keyword which directs the end
of display of the picture, and in that the image display means has an image display start means
which starts the display of the picture when the start directions keyword is extracted, and an
image display end means which ends the display of the picture when the end directions keyword is
extracted.

[Claim 13] The image and speech-communication equipment according to claim 7 or 9, characterized
in that: the video-signal transmitting means includes an image encoder which encodes the image
and outputs an image coded signal; the sound signal transmitting means includes a voice encoder
which encodes the voice and outputs a voice coded signal; the image and voice sending set further
has a transmitting-side image database which accumulates the picture, a compressor which
compresses the picture and outputs a compression picture signal, and a multiplexer which
multiplexes the image coded signal, the voice coded signal, and the compression picture signal
and transmits a multiplexed signal in place of the video signal, the sound signal, and the
picture signal; and the image and voice receiving set further has a separator which separates the
multiplexed signal into an image coded signal, a voice coded signal, and a compression picture
signal, an image decoder which decodes the image coded signal and outputs an image decoded
signal, a voice decoder which decodes the voice coded signal and outputs a voice decoded signal,
and an expander which expands the compression picture signal and outputs an expanded picture
signal.

[Claim 14] The image and speech-communication equipment according to claim 8 or 11, characterized
in that the image-reproduction means has a synthesizer which combines the picture signal selected
by the voice keyword and taken out from the image database with the image decoded signal.

[Claim 15] The image and speech-communication equipment according to any one of claims 10 to 14,
characterized in that: the video-signal transmitting means includes an image encoder which
encodes the image and outputs an image coded signal; the sound signal transmitting means includes
a voice encoder which encodes the voice and outputs a voice coded signal; the image and voice
sending set further has a transmitting-side image database which accumulates the picture, a
compressor which compresses the picture and outputs a compression picture signal, and a
multiplexer which multiplexes the image coded signal, the voice coded signal, and the compression
picture signal and transmits a multiplexed signal in place of the video signal, the sound signal,
and the picture signal; and the image and voice receiving set further has a separator which
separates the multiplexed signal into an image coded signal, a voice coded signal, and a
compression picture signal, an image decoder which decodes the image coded signal and outputs an
image decoded signal, a voice decoder which decodes the voice coded signal and outputs a voice
decoded signal, and an expander which expands the compression picture signal and outputs an
expanded picture signal.

[Claim 16] The image and speech-communication equipment according to any one of claims 1 to 15,
characterized in that the picture includes a static image, a dynamic image, or character data.

[Claim 17] TV-conference equipment equipped with the image and speech-communication equipment
according to any one of claims 7 to 16, characterized in that the video-signal transmitting means
includes a camera through which the image is input, the sound signal transmitting means includes
a microphone through which the voice is input, the image-reproduction means includes a monitor
which displays the image, and the voice reproduction means includes a loudspeaker which outputs
the voice.






DETAILED DESCRIPTION 



[Detailed Description of the Invention] 
[0001] 

[The technical field to which invention belongs] This invention relates to image and
speech-communication equipment. In particular, it relates to a communication device which, in
addition to an image and voice, also transmits pictures prepared beforehand and can display these
pictures on the other party's monitor as appropriate. It is applied to video conference systems
and the like.
[0002] 

[Description of the Prior Art] When a TV conference is used, the participants generally watch the
other party's image for much of the meeting. However, when the content is hard to convey with
voice alone, additionally using image information may help the meeting run effectively. For
example, a speaker may want to display data such as a chart to the other party and explain it in
the middle of a meeting.
[0003] 

[Problem(s) to be Solved by the Invention] However, in a conventional TV conference it is
difficult to select such supplementary information freely from the remote side. The
supplementary information is therefore sent beforehand by FAX or the like, and the required item
is selected according to directions from the remote side. This method inevitably requires
auxiliary information on paper. Moreover, the other party, i.e., the side that receives the
supplementary information, must find and look at the corresponding item after receiving
directions from the explainer through the video conference system. This takes time and effort
and also forces the participant to look away from the image screen.

[0004] As a system that allows presentation information to be conveyed in real time, the
presentation conference system disclosed in JP,6-169456,A is known, for example. In this
conference system, the presentation information created in an editing section is transmitted to
and recorded in the recording section of each meeting member's terminal before the meeting
starts. During the meeting, the presentation section performs the presentation merely by being
sent the necessary directions.

[0005] However, the system described in JP,6-169456,A does no more than transmit presentation
information consisting of pictures and speech information to the presentation terminal side
beforehand and control remotely the timing at which the presentation is performed. It cannot, in
the middle of a TV conference or the like, replace the image with the required image information
as appropriate and display it on a monitor.

[0006] The image and speech-communication equipment of this invention aims to make it possible to
display auxiliary image information in a timely manner, either in place of the usual image or
simultaneously with it, without using paper.
[0007] 

[Means for Solving the Problem] To solve the above-mentioned problem, the image and
speech-communication equipment of this invention comprises a sending set which transmits an image
and voice, and a receiving set which has a receive section that receives the image and voice and
a reproduction section that reproduces the received image and voice. The sending set further has
a picture transmitting section which transmits pictures, and a display-control signal
transmitting section which transmits a display-control signal that specifies a particular picture
among the pictures and directs its display. The receiving set, in turn, further has an image
storage section which receives and accumulates the pictures, and an image display section which
selects from the image storage section the picture specified by the display-control signal
received from the sending set and displays it. When the image and voice receiving set receives a
display-control signal, it displays the specified picture on the graphic display section in place
of the image. The picture may be displayed over the whole graphic display section in place of the
image, or it may be displayed on a part of the image.
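
The receiving-set behaviour described in [0007] can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: the class name `Receiver`, its method names, the string identifiers, and the `partial` flag are all hypothetical and do not appear in the source.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Receiver:
    """Hypothetical receiving set: accumulates pictures, reacts to display-control signals."""
    pictures: Dict[str, bytes] = field(default_factory=dict)  # stands in for the image storage section
    shown_picture: Optional[str] = None  # identifier of the picture on display, if any
    mode: str = "video"                  # "video", "replace" (whole screen) or "partial"

    def register_picture(self, identifier: str, data: bytes) -> None:
        # Pictures are received ahead of time and accumulated under their identifier.
        self.pictures[identifier] = data

    def on_display_control(self, identifier: str, partial: bool = False) -> bool:
        # A display-control signal names one stored picture and directs its display.
        if identifier not in self.pictures:
            return False  # assumption: an unknown identifier leaves the video on screen
        self.shown_picture = identifier
        self.mode = "partial" if partial else "replace"
        return True
```

Because only the identifier travels during the meeting, the picture data itself has to cross the line once, before it is needed.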

[0008] Instead of having the sending set side input a display-control signal specifying the
picture as described above, the image and speech-communication equipment of this invention can
also control the display based on voice. That is, in addition to the basic composition above, the
sending set has a picture transmitting section which transmits pictures corresponding to voice
keywords, while the receiving set has an image storage section which receives and accumulates the
pictures and a speech recognition section which recognizes keywords from the received voice. When
a voice keyword is extracted from the voice, the picture corresponding to this voice keyword is
selected from the image storage section and displayed. A voice keyword may direct not only which
picture to display but also the start or end of the display.
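
The keyword-driven control in [0008] can be sketched as a small state machine. The sketch assumes the speech recognition section has already turned the received voice into a sequence of words; the recognizer itself is not modelled. The class name, the example keywords `show` and `close`, and the rule that a picture keyword both selects and displays its picture are illustrative assumptions, not details from the source.

```python
from typing import Dict, Iterable, Optional

START_KEYWORD = "show"   # hypothetical display start keyword
END_KEYWORD = "close"    # hypothetical display end keyword


class KeywordDisplayController:
    """Hypothetical receiver-side logic: recognized words select and show stored pictures."""

    def __init__(self, pictures_by_keyword: Dict[str, bytes]) -> None:
        self.pictures_by_keyword = pictures_by_keyword  # image storage keyed by voice keyword
        self.selected: Optional[str] = None
        self.displaying = False

    def feed(self, recognized_words: Iterable[str]) -> None:
        for word in recognized_words:
            if word in self.pictures_by_keyword:
                # A picture keyword selects its picture and puts it on display.
                self.selected = word
                self.displaying = True
            elif word == END_KEYWORD:
                self.displaying = False
            elif word == START_KEYWORD and self.selected is not None:
                # Resume showing the most recently selected picture.
                self.displaying = True

    def current_picture(self) -> Optional[bytes]:
        if self.displaying and self.selected is not None:
            return self.pictures_by_keyword[self.selected]
        return None
```

With this arrangement no separate control channel is needed: the speaker's own words carry the selection.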

[0009] As a more concrete composition of the image and speech-communication equipment of this
invention, an image and voice transmission equipment is used which comprises an image and voice
sending set and an image and voice receiving set. The sending set has a video-signal transmitting
section which converts an inputted image into a video signal and transmits it, and a sound signal
transmitting section which converts an inputted voice into a sound signal and transmits it. The
receiving set has a video-signal receive section which receives the video signal, a sound signal
receive section which receives the sound signal, an image-reproduction section which reproduces
the received video signal as an image, and a voice reproduction section which reproduces the
received sound signal as voice. The image and voice sending set further has a picture
transmitting section which transmits pictures to the image and voice receiving set, and a
display-control signal transmitting section which transmits a display-control signal that
specifies a particular picture among the pictures and directs its display. The image and voice
receiving set, in turn, further has an image database which receives and accumulates the pictures
and voice keywords, and an image display section which selects from the image storage section the
picture specified by the received display-control signal and displays it.

[0010] Alternatively, instead of a display-control signal being input on the transmitting side,
the receiving set may be equipped with a speech recognition section which recognizes voice
keywords from the sound signal. In that case, when a voice keyword is extracted, the receiving
set side selects the picture corresponding to this voice keyword from the image database as the
picture to be displayed and displays it on the image display section, as mentioned above.
[0011] In such a composition, the video-signal transmitting section has an image encoder which
encodes the image and outputs an image coded signal, and the sound signal transmitting section
has a voice encoder which encodes the voice and outputs a voice coded signal. A picture is
transmitted as a compression picture signal compressed by the compressor. Furthermore, the image
coded signal, the voice coded signal, and the compression picture signal may be multiplexed so
that the image, voice, and picture are transmitted as a single multiplexed signal. On the
receiving set side, the multiplexed signal is separated by the separator into an image coded
signal, a voice coded signal, and a compression picture signal, which are then restored to the
original signals by the image decoder, the voice decoder, and the expander, respectively.
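
The multiplexing and separation in [0011] can be illustrated with a toy framing scheme. The source does not specify any multiplex format or compression method, so everything here is an assumption made for illustration: each packet is framed as a one-byte stream tag plus a four-byte length, `zlib` stands in for the compressor and expander, and the image and voice encoders are left out (their coded output is treated as opaque bytes).

```python
import struct
import zlib
from typing import Dict, List, Tuple

VIDEO, AUDIO, PICTURE = 0, 1, 2   # hypothetical stream tags
_HEADER = struct.Struct(">BI")    # 1-byte tag, 4-byte big-endian payload length


def multiplex(packets: List[Tuple[int, bytes]]) -> bytes:
    """Interleave tagged packets into one byte stream (multiplexer)."""
    out = bytearray()
    for tag, payload in packets:
        out += _HEADER.pack(tag, len(payload))
        out += payload
    return bytes(out)


def demultiplex(stream: bytes) -> Dict[int, List[bytes]]:
    """Split the stream back into per-tag packet lists (separator)."""
    separated: Dict[int, List[bytes]] = {VIDEO: [], AUDIO: [], PICTURE: []}
    offset = 0
    while offset < len(stream):
        tag, length = _HEADER.unpack_from(stream, offset)
        offset += _HEADER.size
        separated[tag].append(stream[offset:offset + length])
        offset += length
    return separated


def compress_picture(raw: bytes) -> bytes:
    return zlib.compress(raw)          # stand-in for the compressor


def expand_picture(compressed: bytes) -> bytes:
    return zlib.decompress(compressed)  # stand-in for the expander
```

The tag lets the separator route each packet to the right decoder or to the expander without inspecting the payload.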

[0012] The picture signal selected by the voice keyword and taken out from the image database and
the image decoded signal are combined by a synthesizer and can be displayed simultaneously on the
graphic display section.
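
The combining step in [0012] amounts to pasting the selected picture into a region of the decoded video frame. The sketch below assumes frames are plain lists of pixel rows and that the paste position is given as a row and column offset; neither detail comes from the source, and the function name `composite` is hypothetical.

```python
from typing import List

Frame = List[List[int]]  # rows of pixel values; a stand-in for a decoded video frame


def composite(frame: Frame, picture: Frame, top: int, left: int) -> Frame:
    """Return a copy of `frame` with `picture` pasted at (top, left), clipped to the frame."""
    result = [row[:] for row in frame]  # copy so the input frame is left untouched
    for r, picture_row in enumerate(picture):
        for c, value in enumerate(picture_row):
            y, x = top + r, left + c
            if 0 <= y < len(result) and 0 <= x < len(result[y]):
                result[y][x] = value
    return result
```

Pixels of the picture that fall outside the frame are dropped, so the picture can sit in a corner of the screen while the video stays visible around it.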

[0013] TV conference equipment can also be constituted using the above image and
speech-communication equipment, by providing the video-signal transmitting section with a camera
which captures an image, the sound signal transmitting section with a microphone which picks up
voice, the image-reproduction section with a monitor which displays the image, and the voice
reproduction section with a loudspeaker which outputs the voice.

[0014] 

[Embodiments of the Invention] Next, the image and voice communication apparatus of this
invention, and TV conference equipment using this apparatus, are explained in detail below with
reference to the drawings.

[0015] Drawing 1 is a diagram showing the configuration of one embodiment of the image and voice
communication apparatus of this invention.

[0016] As shown in Drawing 1, to make the explanation easier to understand, this embodiment is
illustrated as consisting of a transmitting device 4, a receiving device 5, and a transmission
line 45 that connects the two. When the apparatus is applied to actual TV conference equipment or
an actual communication device, both sides are usually equipped with a transmitting device and a
receiving device, so the communication device of this embodiment shown in Drawing 1 need only be
arranged bidirectionally.

[0017] In Drawing 1, the transmitting device 4 is equipped with a camera 1 that captures an
image, a microphone 2 that captures voice, and an image encoder 41 and a voice encoder 42 that
encode the captured image and voice, respectively. It is further equipped with a compressor 43
that compresses information read from the image database 3, so that a picture to be shown to the
other party can be transmitted separately from the usual image used in a TV conference or the
like. The required pictures may be input to the compressor 43 one by one on the spot with a
scanner or the like, or all the picture information to be transmitted may be recorded in the
image database 3 beforehand. A picture here may be not only a still picture but also a moving
picture, and also includes character information such as text, as well as tables and graphs.

[0018] The image, voice, and picture signals obtained from the image encoder 41, the voice
encoder 42, and the compressor 43 pass through a switch 44, are multiplexed by a multiplexer, and
are transmitted to the receiving device 5 through the transmission line 45. In this embodiment,
the picture information that is to be used as auxiliary material is transmitted to the receiving
device 5 in advance.
[0019] The receiving device 5 is equipped with a demultiplexer 51 that separates the received
multiplexed signal into a video signal, a sound signal, and a picture signal, a switch 52, an
image decoder 53, a voice decoder 54, and an expander 55. The original video signal, sound
signal, and picture signal are reproduced by the image decoder 53, the voice decoder 54, and the
expander 55, respectively.

[0020] The picture signal transmitted in advance, as mentioned above, is expanded by the expander
55 and then stored in an image database 58. An identifier is attached to each picture signal so
that it can respond to a display instruction transmitted later from the transmitting device. That
is, on the transmitting side, the pictures to be used at a meeting or the like are registered in
the image database 3 beforehand together with their identifiers. Each registered picture signal
is compressed by the compressor 43, transmitted to the other party, expanded by the expander 55,
and then registered in the image database 58 of the other party.
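
The advance registration described in this paragraph can be sketched as follows. This is only an
illustration: the class and function names are hypothetical, and zlib stands in for the
unspecified compression performed by compressor 43 and reversed on the receiving side.

```python
import zlib
from typing import Dict, ItemsView


class ImageDatabase:
    """Stores pictures under their identifiers (stand-in for databases 3 and 58)."""

    def __init__(self) -> None:
        self._pictures: Dict[str, bytes] = {}

    def register(self, identifier: str, picture: bytes) -> None:
        self._pictures[identifier] = picture

    def lookup(self, identifier: str) -> bytes:
        return self._pictures[identifier]

    def items(self) -> ItemsView[str, bytes]:
        return self._pictures.items()


def pre_transmit(sender_db: ImageDatabase, receiver_db: ImageDatabase) -> None:
    """Copy every registered picture to the receiving side before the meeting."""
    for identifier, picture in sender_db.items():
        compressed = zlib.compress(picture)      # role of compressor 43 (zlib is an assumption)
        # The compressed signal would cross transmission line 45 at this point.
        restored = zlib.decompress(compressed)   # expansion on the receiving side
        receiver_db.register(identifier, restored)
```

Because the identifier travels with each picture, a later display instruction only needs to carry
the identifier rather than the picture data itself.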

[0021] Next, the display operation during a meeting or the like is explained. During a meeting,
the video signal captured by the camera 1 of the transmitting device 4 is normally transmitted
and displayed on the monitor 6 on the receiving device 5 side. Likewise, the voice on the
transmitting device 4 side is transmitted to the receiving device 5 side and reproduced by the
loudspeaker 7.

[0022] Suppose that the transmitting device 4 side now wishes to use, for explanation, a picture
that has been registered beforehand and stored on the receiving device 5 side. A display-control
signal containing the above-mentioned identifier, which identifies the picture to be displayed,
is input from the display-control signal input 91 on the transmitting device 4 side and
transmitted. The identifier specifying the picture need only be input from a keyboard or the like.

[0023] The receiving device 5 that has received the display-control signal separates and extracts
it at the display-control signal output 92, selects the corresponding picture from the image
database 58 according to the extracted signal, and displays it on the monitor 6. This display may
completely replace the usual image, or the display screen may be divided so that the picture is
displayed in part of it. The picture may also be displayed on a monitor separate from the one
showing the image (that monitor is not illustrated). When the picture is displayed in part of the
screen, as shown in Drawing 1, the video signal from the image decoder 53 and the picture signal
read from the image database 58 are combined by the synthesizer 59.
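
The selection and display choices in this paragraph can be sketched as below. The specification
does not say how the display mode is chosen, so the `partial` flag, the `Screen` type, and the
function name are hypothetical, and strings stand in for the actual video and picture signals.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DisplayControlSignal:
    identifier: str        # identifies a picture registered in advance
    partial: bool = False  # hypothetical flag: show the picture in part of the screen


@dataclass
class Screen:
    main: str                    # what fills the screen
    inset: Optional[str] = None  # picture shown in part of the screen, if any


def compose(video_frame: str, signal: Optional[DisplayControlSignal],
            image_db: Dict[str, str]) -> Screen:
    """Decide what monitor 6 shows for one frame."""
    if signal is None or signal.identifier not in image_db:
        return Screen(main=video_frame)                 # usual image only
    picture = image_db[signal.identifier]
    if signal.partial:
        return Screen(main=video_frame, inset=picture)  # combined display (unit 59)
    return Screen(main=picture)                         # picture replaces the usual image
```

Falling back to the usual image for an unknown identifier is a design choice of this sketch; the
specification does not describe that case.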

[0024] Next, display using identification of the picture by a spoken keyword is explained. In the
above embodiment, the picture to be displayed is specified by directly inputting its identifier.
The image and voice communication apparatus of this invention is instead characterized in that
the keyword specifying a picture is extracted from the voice, and the picture to be displayed is
specified by this keyword.

[0025] Drawing 2 is a block diagram showing the configuration of the second embodiment of the
image and voice communication apparatus of this invention.

[0026] The configurations of the transmitting device 4 and the receiving device 5 are almost the
same as in the first embodiment of this invention shown in Drawing 1. This embodiment differs in
that a speech recognizer 56 and a data discriminator 57 are provided downstream of the voice
decoder 54. On the other hand, the function for inputting the identifier that specifies the
picture to be displayed is no longer needed on the transmitting device 4 side.

[0027] In this embodiment, each auxiliary picture is first registered in the image database 3 of
the transmitting device 4 together with the voice used as its keyword. The registered data are
compressed by the compressor 43, transmitted to the other party, expanded by the expander 55, and
then registered in the image database 58 of the other party. The keyword is also registered in
the speech recognizer 56 and associated with the image database.
[0028] The image from the camera 1 is encoded by the image encoder 41 and supplied to the
multiplexer 45 through the switch 44. The voice from the microphone 2 is encoded by the voice
encoder 42 and supplied to the multiplexer 45 through the switch 44. The switch 44 switches
between the image and voice on the one hand and the data on the other. The multiplexer 45
multiplexes the encoded image and the encoded voice, and the result is transmitted to the
receiving device 5 through the transmission line 45.

[0029] In the receiving device 5, the demultiplexer 51 separates the multiplexed signal received
through the transmission line 45 into the encoded image and the encoded voice, which are then
sent to the switch 52. The encoded image is supplied to the image decoder 53, decoded, and
supplied to the synthesizer 59. The encoded voice is supplied to the voice decoder 54, decoded,
and supplied to the speech recognizer 56 and the loudspeaker 7.

[0030] The speech recognizer 56 recognizes the input voice and, when it finds a match with a
keyword registered beforehand, sends the result to the data discriminator 57. The data
discriminator 57 selects the data corresponding to the recognition result from the image database
58 and supplies the data to the synthesizer 59. The synthesizer 59 combines the decoded image and
the selected data and sends the result to the monitor 6. The monitor 6 displays a composite
screen of the transmitted image and the data selected by voice.
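
The keyword-driven selection described above can be sketched as follows. Speech recognition
itself is outside the scope of this sketch: it assumes the recognizer has already produced a text
transcript, and the function names and whole-word matching rule are assumptions rather than
anything the specification prescribes.

```python
import re
from typing import Dict, Iterable, Optional


def recognize_keyword(transcript: str, keywords: Iterable[str]) -> Optional[str]:
    """Return the first registered keyword found in the transcript (role of unit 56)."""
    text = transcript.lower()
    for keyword in keywords:
        # Whole-word match, so that "drawing 5" does not fire on "drawing 50".
        if re.search(r"\b" + re.escape(keyword.lower()) + r"\b", text):
            return keyword
    return None


def select_picture(transcript: str, image_db: Dict[str, bytes]) -> Optional[bytes]:
    """Pick the picture paired with the recognized keyword (role of unit 57)."""
    keyword = recognize_keyword(transcript, image_db)
    return image_db[keyword] if keyword is not None else None
```

When `select_picture` returns a picture, it would be combined with the decoded image before
being sent to the monitor; when it returns `None`, the usual image is shown unchanged.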
[0031] The selection and display of a picture by keyword, using the communication device
configured as described above, is now explained somewhat more concretely.
[0032] On the transmitting device 4 side, the voice spoken at the meeting or the like is encoded
by the voice encoder 42 and transmitted to the receiving device 5 side. Suppose the speaker now
wishes to display an auxiliary picture on the monitor of the receiving device 5 for explanation
or the like; the speaker then utters the words set beforehand for specifying and displaying that
picture. For example, suppose two or more drawings and photographs have been registered
beforehand in the image database 58 on the receiving device 5 side, and drawing 5 among them is
to be displayed on the monitor 6. The speaker says into the microphone 2 words containing the
keywords for displaying drawing 5, such as "have a look since drawing 5 is displayed".
[0033] The sound signal is then decoded by the voice decoder 54 of the receiving device 5. Since
the speech recognizer 56 finds the keyword "drawing 5", which specifies a picture, the data
discriminator 57 selects drawing 5 from the pictures registered beforehand in the image database
58. Since the keyword "display" is also extracted, the picture is displayed on the monitor 6. As
in the first embodiment, this display may replace the image currently displayed on the monitor 6,
or may occupy part of the screen.

[0034] After the receiving device 5 has extracted the above keyword and started displaying the
corresponding picture, the display of the auxiliary picture is ended and the screen automatically
returns to the original image when, for example, a registered keyword that terminates the
display, such as "end", is extracted from the voice.

[0035] As described above, keywords that identify and control the start and end of a display may
be prepared in addition to the keyword that specifies a picture. Alternatively, the display may
be started automatically by the picture-specifying keyword alone and the screen returned
automatically to the original image after the picture has been displayed for a fixed time, or a
keyword may be provided only for instructing the end.
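
The start and end behaviors listed above can be sketched as a small state holder. Everything
here is illustrative: the class name, the `timeout` parameter, and the use of a text transcript
in place of recognized voice are assumptions, not details given in the specification.

```python
import re
from typing import Dict, Optional


def _contains(text: str, keyword: str) -> bool:
    """Whole-word, case-insensitive keyword test."""
    return re.search(r"\b" + re.escape(keyword.lower()) + r"\b", text.lower()) is not None


class AuxiliaryDisplay:
    """Tracks which auxiliary picture, if any, is currently shown."""

    def __init__(self, pictures: Dict[str, str], end_keyword: str = "end",
                 timeout: Optional[float] = None) -> None:
        self.pictures = pictures        # keyword -> picture
        self.end_keyword = end_keyword  # hypothetical end keyword
        self.timeout = timeout          # seconds before automatic return, or None
        self.current: Optional[str] = None
        self._shown_at = 0.0

    def on_speech(self, transcript: str, now: float) -> None:
        # An end keyword closes the picture that is currently displayed.
        if self.current is not None and _contains(transcript, self.end_keyword):
            self.current = None
            return
        # A picture keyword alone starts the display.
        for keyword, picture in self.pictures.items():
            if _contains(transcript, keyword):
                self.current = picture
                self._shown_at = now
                return

    def tick(self, now: float) -> None:
        # Return to the usual image once the fixed display time has elapsed.
        if (self.current is not None and self.timeout is not None
                and now - self._shown_at >= self.timeout):
            self.current = None
```

Setting `timeout=None` gives the variant in which only an end keyword closes the picture, while a
finite `timeout` gives the automatic return after a fixed display time.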

[0036] With the configuration of the second embodiment of this invention described above,
auxiliary image data paired with keywords are stored in the image database, and the auxiliary
image data corresponding to a keyword in the received voice are selected, combined with the
received image, and displayed. Data can therefore be selected at the right moment without any
operation on the receiving side.

[0037] The image and voice communication apparatus of this invention has been explained above
using, as an example, a communication device in which a transmitting device and a receiving
device form a pair. A TV conference system is normally configured so that each terminal has a
transmitting device and a receiving device as a pair and image and voice can be communicated
bidirectionally. Moreover, although communication between only two terminals was used as the
example above to make the invention easier to understand, it goes without saying that TV
conference equipment may be configured among three or more terminals, in which case the keyword
for picture display is extracted by two or more receiving devices from the voice uttered at one
transmitting device, and the picture can be displayed at each receiving device.
[0038] 

[Effect of the Invention] As explained above, the image and voice communication apparatus of this
invention attaches an identifier to each picture beforehand and transmits and registers the
picture on the receiving device side; during a meeting, only the identifier is transmitted, and
the picture is selected from the image database on the receiving device side and displayed.
Transmission of the auxiliary picture signal during the meeting is therefore unnecessary. For
this reason, the desired picture can be displayed on the receiving side instantly.

[0039] Furthermore, by registering auxiliary pictures paired with keywords in the image database
beforehand, the image and voice communication apparatus of this invention can also be configured
to recognize the received voice by speech recognition, select the data corresponding to the voice
from the image database, and display the data either in place of the received image or combined
with it. With this configuration, at a TV conference or the like, data corresponding to the voice
of the transmitting side can be displayed on the receiving side simultaneously with the voice,
without any operation on the receiving side.



DESCRIPTION OF DRAWINGS



[Brief Description of the Drawings] 

[Drawing 1] A block diagram showing the configuration of the first embodiment of the image and
voice communication apparatus of this invention.

[Drawing 2] A block diagram showing the configuration of the second embodiment of the image and
voice communication apparatus of this invention.
[Description of Notations] 

1 Camera 

2 Microphone 

3 Image Database 

4 Transmitting Section of CODEC 
41 Image Encoder